Golang Job: SRE - Ansible on Cloud

Company

Red Hat Software
Portugal

Location

Remote Position
(From Everywhere/No Office Location)

Job type

Full-Time

Golang Job Details

About the job:
The Red Hat Ansible Engineering (https://www.ansible.com/) team is looking for a highly motivated self-starter to join our Managed Ansible on Cloud team as a Site Reliability Engineer. In this role, you will work on a team of highly talented software engineers whose mission is to grow the Red Hat cloud offering for the Ansible Automation Platform.

Job Summary

Using your expertise in SRE principles of automation and continuous improvement, you will help create an environment where availability, reliability, and security are incorporated throughout the entire application lifecycle rather than treated as an afterthought. As an SRE, you will build tooling to automate the building, testing, deployment, promotion, monitoring, alerting, and maintenance of the Red Hat Ansible Managed Application on Azure.

You will have the opportunity to collaborate with diverse agile teams around the world to deliver value for our customers and partners in an open source way. This is also a great opportunity to hone your skills while working with a wide range of modern languages, frameworks, and technologies. As a Site Reliability Engineer, you will help bring a cloud-ready mentality to the Ansible organization. You will also become a part of Red Hat’s culture, which makes us unique in the industry. You will work with communities (e.g., https://www.ansible.com/community) that are passionate about open source software. This is a Europe-based position. Successful applicants must reside in a country where Red Hat is registered to do business.

What you will do:
Develop and maintain software to automatically provision, upgrade, monitor, and heal Red Hat Ansible Automation Platform managed applications in Azure.
Write Ansible automation playbooks to reduce toil.
Support the operations of Red Hat Ansible Automation Platform by responding to and troubleshooting system alerts.
Perform root cause analysis on outages and work with Ansible Automation Platform engineering teams to improve the underlying product for the cloud offering.
Provide engineering support to Red Hat's global technical support team to resolve customer issues.
Participate in a global on-call rotation, which could involve the occasional weekend or holiday.

What you will bring:
Passion for learning new technologies, building elegant software systems, troubleshooting complex technical issues, and automation.
Software development experience using Python or Go.
Linux administration experience. Experience with Red Hat Enterprise Linux (RHEL), CentOS, or Fedora is a plus.
Kubernetes administration experience.
Understanding of computer networking including DNS.
Basic knowledge of software development life cycle tools, like GitHub and Jenkins.
Familiarity with the software development life cycle (SDLC) and agile or Scrum processes.
Experience supporting a customer-facing service.
Basic knowledge of monitoring systems.
Good written and verbal communication skills in English.

Experience with the following is considered a plus:
Writing Ansible playbooks and administering the Red Hat Ansible Automation Platform
Familiarity with data center networking and routing protocols
Cloud native development/administration experience (Azure preferred)
Operations experience with a production user-facing application
Prior experience working on a globally distributed, remote team
Open source software (OSS) contribution
Microsoft Azure and Azure Resource Manager Templates